Abstract

Background: Until recently, genome-wide association studies (GWAS) have been restricted to research groups with
the budget necessary to genotype hundreds, if not thousands, of samples. Replacing individual genotyping with 
genotyping of DNA pools in Phase I of a GWAS has proven successful, and dramatically altered the financial 
feasibility of this approach. When conducting a pool-based GWAS, how well SNP allele frequency is estimated from 
a DNA pool will influence a study's power to detect associations. Here we address how to control the variance in 
allele frequency estimation when DNAs are pooled, and how to plan and conduct the most efficient well-powered
pool-based GWAS. 

Methods: By examining the variation in allele frequency estimation on SNP arrays between and within DNA pools 
we determine how array variance [var(earray)] and pool-construction variance [var(econstruction)] contribute to the 
total variance of allele frequency estimation. This information is useful in deciding whether replicate arrays or 
replicate pools are most useful in reducing variance. Our analysis is based on 27 DNA pools ranging in size from 
74 to 446 individual samples, genotyped on a collective total of 128 Illumina beadarrays: 24 1M-Single, 32 1M-Duo,
and 72 660-Quad. 

Results: For all three Illumina SNP array types our estimates of var(earray) were similar, between 3 × 10⁻⁴ and 4 × 10⁻⁴ for
normalized data. Var(econstruction) accounted for 20-40% of pooling variance across 27 pools in normalized
data. 

Conclusions: We conclude that relative to var(earray), var(econstruction) is of less importance in reducing the variance
in allele frequency estimation from DNA pools; however, our data suggest that on average it may be more
important than previously thought. We have prepared a simple online tool, PoolingPlanner (available at
http://www.kchew.ca/PoolingPlanner/), which calculates the effective sample size (ESS) of a DNA pool given a range of
replicate array values. ESS can be used in a power calculator to perform pool-adjusted calculations. This allows one 
to quickly calculate the loss of power associated with a pooling experiment to make an informed decision on 
whether a pool-based GWAS is worth pursuing. 



Background 

Genome-wide association studies (GWAS) have been 
used to examine over 200 diseases and traits, and identi- 
fied over 4000 single nucleotide polymorphisms (SNPs) 
associated with these traits, as listed in the Catalog of 
Published Genome-Wide Association Studies [1]. In 
many cases, GWAS have revealed previously 
unsuspected molecular mechanisms of disease, highlighting
the value of this hypothesis-free approach
[reviewed in [2,3]]. Unfortunately, GWAS are very costly 
due to the price of genotyping thousands of individual 
DNA samples on high-density SNP arrays. Conse- 
quently, GWAS have only been feasible for research 
groups with the necessary budget, studying well-funded 
diseases or traits. A simple strategy to drastically reduce 
cost is to replace individual genotyping in Phase I of a 
GWAS with genotyping of DNA pools. DNA pools yield 
estimated allele frequencies rather than observed 
genotypes; hence, this step has been called allelotyping
[4]. Several studies have provided proof of principle for 
the pooling strategy, using it to re-discover known dis- 
ease-variant associations of moderate to large effect size 
for a fraction of the cost of conventional GWAS [5,4]. 
To date, more than twenty pool-based GWAS have
been published, many reporting genome-wide significant 
associations for diseases and traits such as follicular 
lymphoma, otosclerosis, multiple sclerosis, Alzheimer's 
disease, melanoma, psoriasis, and skin colour [6-12]. 
Depending on the number of samples being pooled, the 
cost reduction in Phase I can easily reach 100 fold. Con- 
sider, if a SNP array costs $250 and there are 2000 cases 
and 2000 controls to genotype, a million dollars is 
required for Phase I individual genotyping alone. Con- 
versely, the pool-based experiment using 12 replicate 
arrays on two pools (case and control) would be $6000, 
or 0.6% of the cost. Simply put, a pooling GWAS is fea- 
sible for most grant budgets, while an individual geno- 
typing GWAS is not. The criticism of pool-based 
GWAS is that they have reduced power relative to con- 
ventional GWAS because of errors introduced by esti- 
mating allele frequency from DNA pools rather than 
individual genotyping data. While it is true that pool- 
based GWAS forfeit some power, these losses can be 
estimated, are often less than expected, and may not 
change the associations discovered. Although array costs 
will continue to drop and conventional GWAS will 
become more feasible, the potential savings associated 
with the pooling approach will scale in proportion, leav- 
ing more funds for subsequent replication, fine-map- 
ping, and sequencing of associated genomic regions. For 
diseases or traits with unknown biology or genetic invol- 
vement, a pooling GWAS represents an economical way 
to test for associations with moderate odds ratios. In 
addition, work using DNA extracted from pooled whole 
blood suggests that a large time-savings (50-100 fold) 
may also be possible, presenting the possibility of an 
incredibly fast (<1 month) and economical experiment 
[5]. For a comprehensive introduction and review of 
DNA pooling readers are directed to Sham et al. 2002 
and Pearson et al. 2007 [13,4], and for a set of best 
practices for any GWAS to Pearson & Manolio, 2008 
[14]. 
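The cost comparison given above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python; the function names are illustrative and not part of any published tool, and the figures ($250 per array, 2000 cases, 2000 controls, 12 replicate arrays on each of two pools) are those quoted in the text.

```python
def individual_cost(n_samples: int, array_price: float) -> float:
    """Cost when every sample is genotyped on its own array."""
    return n_samples * array_price


def pooled_cost(n_pools: int, replicate_arrays: int, array_price: float) -> float:
    """Cost when each pool is allelotyped on the same number of replicate arrays."""
    return n_pools * replicate_arrays * array_price


price = 250.0                                        # price per SNP array, from the text
conventional = individual_cost(2000 + 2000, price)   # 2000 cases + 2000 controls
pooled = pooled_cost(2, 12, price)                   # one case pool and one control pool
fraction = pooled / conventional                     # share of the conventional cost
fold_reduction = conventional / pooled               # cost reduction factor

print(conventional, pooled, fraction, round(fold_reduction, 1))
```

With these inputs the pooled design costs $6000 against $1,000,000, which is the 0.6% figure quoted above and a reduction of more than 100-fold.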

We know that in the process of estimating allele frequencies
from DNA pools we introduce errors, and these
must be taken into consideration to plan an adequately
powered experiment or to appropriately calculate association
statistics [15,16]. With respect to doing this, the
most important consideration is the pooling variance
[17], the variance in the errors arising from estimating
allele frequency from a DNA pool. Pooling variance is
the sum of many sources of variation, including in parti- 
cular, array variance and pool construction variance. 



Array variance can be attributed to those errors arising 
from estimating allele frequency from a DNA pool on 
an SNP array [17,18]. Pool construction variance can be 
attributed to those errors arising from the physical crea- 
tion of a DNA pool. As pooling variance increases, the 
ability of a pool-based GWAS to detect odds ratios simi- 
lar to those detectable by conventional GWAS 
decreases. In this report we assume pooling variance is 
the sum of array variance and pool-construction var- 
iance and attempt to determine which makes the greater 
contribution to the pooling variance. This is relevant to 
determining how best to design a pool-based GWAS 
and how to allocate resources; for example, replicate
arrays can be used to reduce array variance and/or 
pools can be constructed in replicate to control pool 
construction variance. 

Here we partition and estimate variance components 
using the approach described by MacGregor [17], which 
examines variation in allele frequency measurements 
between and within DNA pools. Briefly, within-pool var- 
iation is that observed between two arrays used to alle- 
lotype the same DNA pool (i.e. replicate arrays), and is 
an estimate of array variance. Between-pool variation is 
that observed between two arrays used to allelotype two 
different DNA pools, and is an estimate of pooling var- 
iance. Estimates of array variance and pooling variance 
are used to calculate pool construction variance by sub- 
traction [17]. Using this approach in an analysis of two 
DNA pools allelotyped on twelve Affymetrix Genechip 
HindIII arrays (6 arrays per pool) MacGregor [7] found
that approximately 87.5% of pooling variation could be 
attributed to the arrays, leaving 12.5% to pool-construc- 
tion [17]. It was noted, however, that more data sets 
would be necessary to determine the variability in these 
estimates. Here we inspect 27 DNA pools allelotyped on 
a total of 128 Illumina arrays, including the Human1M
Single (1M-Single), Human1M Duo (1M-Duo), and
HumanHap660 Quad (660-Quad) arrays, allowing us to
better address the question of what values array variance 
and pool-construction variance are likely to take. In 
addition, we perform our analysis on normalized array 
data and raw array data to examine how normalization 
affects pooling variance estimates. 

In the first part of this study we establish values for 
array variance and pool-construction variance. In the 
second part, we use these estimates to calculate the 
effective sample size (ESS) of a DNA pool (where ESS is 
the equivalent number of samples that would need to be 
individually genotyped to give a similar result) [19]. We 
also present a simple online tool, PoolingPlanner, which 
uses our empirical variance estimates as default values 
to calculate the effective sample size (ESS) of a DNA 
pool given a range of replicate array values (available at 
http://www.kchew.ca/PoolingPlanner/). PoolingPlanner 
also accepts user-supplied values for variance estimates.
ESS can then be used in one of the available power cal- 
culators, such as CaTS [20], or Quanto [21], to perform 
pool-adjusted power calculations [4]. PoolingPlanner is 
intended to help researchers quickly calculate the loss of 
power associated with a particular pooling experiment, 
which is a first step in making an informed decision on
whether a pool-based GWAS is worth pursuing. 

Methods 

Data 

Our analysis is based on 27 DNA pools ranging in size 
from 74 to 446 individual samples. These were allelotyped
on a collective total of 128 Illumina beadarrays:
24 1M-Single, 32 1M-Duo, and 72 660-Quad. Our dataset
comprises four batches of genotyping (details given
in Additional File 1, Table S1), which correspond to
four ongoing pool-based GWAS that have not yet been 
published. Each of these studies was approved by the 
joint Clinical Research Ethics Board of the British 
Columbia Cancer Agency and the University of British 
Columbia. All subjects gave written informed consent. 

Genomic DNA was extracted from peripheral venous 
blood collected between 2001 and 2008 by different 
laboratories using different methods. DNA samples were 
diluted to 50-100 ng/µL and then quantified in duplicate
by fluorometry using PicoGreen™ (Molecular Probes,
Eugene, OR, US). Pools were constructed by combining 
200 ng of each sample DNA by manual pipetting. Pools 
were assayed (allelotyped) at the Centre for Applied 
Genomics at Sick Children's Hospital in Toronto.

SNP allele frequency in DNA pools was estimated 
using Illumina's beadarrays, where on average each SNP
is estimated by 16-18 "bead" observations per array (oli- 
gonucleotide probes are designed to assay a SNP and 
attached to beads, where individual beads are coated 
with one probe type and interrogate one site in the gen- 
ome) [22]. Equation 1 was used in the calculation of 
each SNP allele frequency: 

$$p_i = \frac{G_i}{G_i + R_i}, \qquad i = 1, \ldots, n \qquad (1)$$

where G_i and R_i are the green and red fluorescence
intensity for the ith bead assaying a given SNP. The two
colours correspond to the two alleles of the SNP, and n
is the number of beads assaying a given SNP, typically
16-18. Illumina beadarrays are manufactured such that
there are multiple strips on each array [22], and our 
preliminary analysis revealed that unique groups of 
SNPs are consistently on only a subset of strips. From 
our previous experience, and that of others [18], it was 
known that the average relative intensity of the red and 
green channels could differ dramatically between strips 



and between arrays. To prevent these manufacturing 
and/or assaying properties from biasing allele frequency 
estimation, a simple normalization was performed. Each 
array was normalized on a strip-by-strip basis by adjust- 
ing the red channel intensity to give a mean strip-wide 
allele frequency estimate of 0.5 [18]. To examine the 
effect of this normalization on the variance terms esti- 
mated, the analyses presented in this paper are per- 
formed on both normalized and raw Illumina array 
data. 
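The strip-by-strip normalization described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes the per-bead allele frequency is G/(G + R), that normalization multiplies every red intensity on a strip by a single factor c, and that c is found numerically by bisection. The function names `bead_frequency` and `strip_red_scale` are hypothetical.

```python
from statistics import mean


def bead_frequency(green: float, red: float, red_scale: float = 1.0) -> float:
    """Per-bead allele frequency G / (G + c*R), where c rescales the red channel."""
    return green / (green + red_scale * red)


def strip_red_scale(beads, target: float = 0.5, tol: float = 1e-9) -> float:
    """Find a red-channel scale c giving a mean strip-wide frequency of `target`.

    beads: list of (green, red) pairs with positive intensities.
    The mean of G / (G + c*R) decreases monotonically as c grows, so a
    bisection on the geometric midpoint converges. The solver is an
    illustrative choice; the source does not say how the adjustment is found.
    """
    lo, hi = 1e-6, 1e6
    for _ in range(200):
        mid = (lo * hi) ** 0.5                      # geometric midpoint of the bracket
        m = mean(bead_frequency(g, r, mid) for g, r in beads)
        if abs(m - target) < tol:
            break
        if m > target:
            lo = mid                                # mean too high: weight red more
        else:
            hi = mid                                # mean too low: weight red less
    return mid


# Hypothetical strip where the green channel is systematically brighter.
strip = [(1200.0, 400.0), (900.0, 300.0), (1500.0, 700.0)]
c = strip_red_scale(strip)
normalized = [bead_frequency(g, r, c) for g, r in strip]
```

After rescaling, the mean of `normalized` sits at 0.5 while the relative ordering of the beads is preserved, which is the property the normalization is meant to provide.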

Statistical Analysis 

Our purpose is to calculate empirical estimates of pool- 
ing variance and array variance, and then to estimate 
pool construction variance by subtraction. Pooling var- 
iance and array variance are both estimated by calculat- 
ing allele frequency differences across two paired (by 
SNP, for all SNPs on the array) arrays [17]. The two 
arrays used in the comparison will dictate whether an 
estimate of array or pooling variance is generated. For 
example, to calculate array variance, let allele frequency 
estimates on array x used to allelotype DNA pool a be:

$$p_{ax} = p_a + e_{\mathrm{array},x}$$

where p_a is the true allele frequency for those samples in
DNA pool a, and e_array,x is the error associated with
estimating the allele frequency from a DNA pool [15].
Then, the variance of the allele frequency difference on
two replicate arrays (x = 1, 2) is [17]:

$$\begin{aligned}
\mathrm{var}(p_{a1} - p_{a2}) &= \mathrm{var}(p_a + e_{\mathrm{array},1} - p_a - e_{\mathrm{array},2}) \\
&= \mathrm{var}(e_{\mathrm{array},1} - e_{\mathrm{array},2}) \\
&= 2\,\mathrm{var}(e_{\mathrm{array}})
\end{aligned}$$

This yields an estimate of array variance:

$$\mathrm{var}(e_{\mathrm{array}}) = \mathrm{var}(p_{a1} - p_{a2})/2$$

where var(p_a1 - p_a2) is calculated as the average of
the squared allele frequency differences for all SNPs, i (i
= 1...n), on arrays 1 and 2:

$$\mathrm{var}(p_{a1} - p_{a2}) = \frac{1}{n}\sum_{i=1}^{n} (p_{a1,i} - p_{a2,i})^2$$

Var(earray) is assumed constant for all SNPs. If more
than two replicate arrays are used to allelotype a given 
DNA pool, multiple array comparisons are possible, and 
the best estimate of var(earray) is the average of all possi- 
ble pairings [17]. 
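The Type A calculation above reduces to a mean squared difference per array pair, halved, and then averaged over all pairs of replicate arrays. A minimal Python sketch follows; the function names are illustrative, and each array is assumed to be a list of allele frequency estimates in the same SNP order.

```python
from itertools import combinations


def paired_variance(p1, p2):
    """Mean squared allele-frequency difference across SNPs for two arrays."""
    if len(p1) != len(p2) or not p1:
        raise ValueError("arrays must cover the same non-empty set of SNPs")
    return sum((a - b) ** 2 for a, b in zip(p1, p2)) / len(p1)


def array_variance(replicates):
    """var(e_array): average over all replicate pairs of var(p_a1 - p_a2) / 2."""
    pairs = list(combinations(replicates, 2))
    if not pairs:
        raise ValueError("need at least two replicate arrays")
    return sum(paired_variance(x, y) / 2 for x, y in pairs) / len(pairs)


# Two hypothetical replicate arrays, two SNPs each.
replicates = [[0.50, 0.20], [0.52, 0.18]]
estimate = array_variance(replicates)
```

In this toy case each SNP differs by 0.02 between the two arrays, so the mean squared difference is 0.0004 and the array variance estimate is half of that.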

If arrays 1 and 2 interrogate two different DNA pools, 
an estimate of pooling variance can be obtained. When 
two DNA pools (a, b) are constructed from identical 
samples (i.e. replicate pool construction),



Earp et al. BMC Medical Genomics 2011, 4:81
http://www.biomedcentral.com/1755-8794/4/81






$$\mathrm{var}(p_{a1} - p_{b1}) = 2\,\mathrm{var}(e_{\mathrm{array}}) + 2\,\mathrm{var}(e_{\mathrm{construction}})$$

where var(econstruction) is the variance in the pool construction
errors, which are assumed to be constant for
all SNPs. Thus, an estimate of pooling variance, var(epooling-1), is
[17]:

$$\mathrm{var}(e_{\mathrm{pooling\text{-}1}}) = \mathrm{var}(p_{a1} - p_{b1})/2$$

where "pooling-1" is used to indicate that this estimate
of pooling variance is based on the comparison of arrays
that allelotype two replicate DNA pools. As before, if
more than two replicate arrays are used to allelotype a
given DNA pool, multiple array comparisons are possible,
and the best estimate of var(epooling-1) is the average
of all possible pairings [17].

When DNA pools a and b are constructed from non-identical
samples (e.g. a case and a control pool), an alternative
estimate of pooling variance is var(epooling-2)
[15,17]:

$$\mathrm{var}(e_{\mathrm{pooling\text{-}2}}) = [\mathrm{var}(p_{a1} - p_{b1}) - V_{a1,b1}]/2$$

Here var(epooling-2) is calculated by taking, for all SNPs i
(i = 1...n) on arrays 1 and 2, the average of
the squared allele frequency difference minus a random
binomial sampling variance term, V_a1,b1,i, and dividing by two:

$$\mathrm{var}(e_{\mathrm{pooling\text{-}2}}) = \frac{1}{n}\sum_{i=1}^{n} \left[(p_{a1,i} - p_{b1,i})^2 - V_{a1,b1,i}\right]/2$$

V_a1,b1,i is calculated using the usual equation for binomial
sampling variance:

$$V_{a1,b1,i} = p_{a1,i}(1 - p_{a1,i})/N_{a1} + p_{b1,i}(1 - p_{b1,i})/N_{b1}$$

The random binomial sampling variance term
accounts for the additional component of variation aris- 
ing from the comparison of non-identical pools. It is 
assumed that the two DNA pools are constructed from 
samples drawn from the same population, and although 
in fact it is often a case and control being compared 
(where we specifically look for differences in allele fre- 
quency), for most SNPs on an array this is a valid 
assumption [15]. 
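The Type C estimate can be sketched in the same style. The code below is an illustration under assumptions rather than a definitive implementation: it takes the binomial term to be the sum of the two pools' sampling variances, and it leaves the denominators `n_a` and `n_b` to the caller, since whether they count individuals or chromosomes is not settled by the text here. The function names are hypothetical.

```python
def binomial_term(p_a: float, p_b: float, n_a: float, n_b: float) -> float:
    """Sampling variance of the allele-frequency difference for one SNP.

    Assumption: the variances of the two independent pools add.
    """
    return p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b


def pooling_variance_2(p_a, p_b, n_a: float, n_b: float) -> float:
    """var(e_pooling-2) from one array on pool a and one array on pool b."""
    if len(p_a) != len(p_b) or not p_a:
        raise ValueError("arrays must cover the same non-empty set of SNPs")
    total = 0.0
    for a, b in zip(p_a, p_b):
        # squared difference minus the sampling term, halved, per SNP
        total += ((a - b) ** 2 - binomial_term(a, b, n_a, n_b)) / 2
    return total / len(p_a)


# Hypothetical single-SNP example with denominators of 1000 per pool.
example = pooling_variance_2([0.6], [0.4], 1000, 1000)
```

Because the sampling term is subtracted, a single pair of arrays can yield a negative value when the observed difference is small; averaging over many SNPs and array pairs is what makes the estimate usable.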

Figure 1 visually summarizes the three types of pair-wise
array comparisons used in this report, including
the sources of error in each comparison. When com- 
paring arrays used to allelotype the same DNA pool 
(henceforth referred to as 'Type A' comparisons), the 
variation observed can only arise due to the arrays, 
giving an estimate of array variance. When comparing 
arrays used to allelotype replicate DNA pools (hence- 
forth referred to as 'Type B' comparisons), the 




Figure 1 Overview of the pair-wise array comparisons
performed in this study. Step 1 depicts the construction of three
DNA pools. The first two pools (orange and red) are constructed
using the same DNA samples and are pool-construction replicates.
The third pool (green) is constructed using different DNA samples.
Step 2 indicates allelotyping on Illumina SNP arrays, where the two
arrays allelotyping the orange pool are array replicates. Step 3 shows
the three types of pair-wise SNP array comparisons that can be 
made, along with the sources of error that account for differences 
in allele frequency estimates on the paired arrays. For Type A 
comparisons, the arrays being compared were used to allelotype 
the exact same DNA pool; hence, the only source of variation is the 
array. For Type B comparisons, the arrays paired were used to 
allelotype independently constructed but identical pools; thus, 
variation may arise due to the array and the pool-construction 
process. For Type C comparisons, the arrays paired were used to 
allelotype completely independent DNA pools, and variation may 
be due to the array, pool-construction, or binomial sampling 
(assuming both pools are independent samples from a single 
population). 

variation observed is due to the arrays and pool-con- 
struction, giving a direct estimate of pooling variance. 
Pool-construction variance is then calculated by sub- 
tracting the array variance (Type A) from the pooling 
variance (Type B). If replicate DNA pools have not 
been constructed, as is the case for many of the pools 
in our data set, we are still able to estimate the pooling 
variance by comparing non-identical pools (henceforth 
referred to as 'Type C' comparisons) and account for
the additional binomial sampling variance term that 
arises in this case. Pool-construction variance is then 
calculated by subtracting Type A values from Type C 
values. 
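The subtraction step is simple enough to state as code. In this sketch the function names are illustrative; setting negative differences to zero follows the convention reported in the Results, and the example inputs are the mean normalized 1M-Single values reported there for Type C comparisons (pooling variance 4.8 × 10⁻⁴, array variance 3.8 × 10⁻⁴).

```python
def construction_variance(var_pooling: float, var_array: float) -> float:
    """Pool-construction variance by subtraction, with negatives set to zero."""
    return max(var_pooling - var_array, 0.0)


def construction_share(var_pooling: float, var_array: float) -> float:
    """Fraction of the pooling variance attributed to pool construction."""
    return construction_variance(var_pooling, var_array) / var_pooling


var_c = construction_variance(4.8e-4, 3.8e-4)   # about 1.0e-4
share = construction_share(4.8e-4, 3.8e-4)      # about 0.21
```

The share of roughly one fifth agrees with the average of about 20% reported for Type C derived values. Averages reported over individual pools will not always match a subtraction of mean values, because the zero floor is applied pool by pool.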
A number of assumptions are made in this analysis.
We assume that the array variance is comparable across
the DNA pools in an experiment, and that the average
array variance is the best estimate. For arrays with larger
than average array variance, perhaps caused by greater variation
in PCR amplification steps and/or measurement
of allele frequency (detection of red and green fluorescence),
array variance will be underestimated; for arrays
with smaller than average array variance it will be overestimated.
It is known that SNPs with smaller minor allele
frequencies are estimated with a greater margin of error,
i.e. var(earray) is not constant for all SNPs. For SNPs with
a small minor allele frequency, average array variance
will underestimate the array variance. We also assume
that the pooling variance is constant across all SNPs, 
and that unequal amplification and/or hybridization of 
alleles (A or B) will have a negligible effect on results. 
Because our analysis is based upon contrasting array 
data from two DNA pools, the effects of unequal hybri- 
dization should largely cancel out [15,18]. 

PoolingPlanner Theory 

In choosing to conduct a pool-based GWAS, one 
accepts a loss in power relative to a conventional 
GWAS. How much power is lost can be expressed in 
terms of the effective sample size (N*) resulting from
pooling N individuals [4]. PoolingPlanner uses an estimate
of var(epooling) to calculate the effective sample size
of a DNA pool. N* and var(epooling) are related through
two expressions for relative sample size (RSS) [defined
in 19]:



$$RSS = \frac{V_s}{V_s + \mathrm{var}(e_{\mathrm{pooling}})} \qquad (2)$$

$$RSS = \frac{N^*}{N} \qquad (3)$$

In one expression (Equation 3), the RSS of a DNA pool is expressed as the
ratio of effective sample size to the actual sample size
(N). In the other (Equation 2), it is expressed as the fraction of the total
variance, (V_s + var(epooling)), explained by the binomial
sampling variance, V_s. V_s is calculated as p(1-p)/2N,
where p is the average minor allele frequency on the 
array, and N is number of individuals contributing to 
the DNA pool. If DNA pools have been constructed in
replicate we let var(epooling) = var(epooling-1); otherwise we
let var(epooling) = var(epooling-2). The two equations for
RSS can then be equated and solved for N*. It is worth
noting that because our calculation of RSS relies on our
empirical estimates of var(epooling) (Equation 2), estimates
which are based on contrasting allele frequencies
in two DNA pools, the effects of unequal hybridization,
which would typically thwart a direct comparison of a
pooling-based and conventional genotyping experiment,
cancel out [15,18].

Replicate arrays can be used to reduce var(epooling) by
a factor of 1/k, where k is the number of replicate arrays
[4]. In making var(epooling) smaller the RSS and N*
become larger. Effective sample size can then be used
with one of the available power calculators, for example 
CaTS [20] or Quanto [21] to perform pool-adjusted 
power calculations [4]. PoolingPlanner is intended to 
help first time users plan a DNA pooling experiment, 
and our empirical estimates of array variance and pool 
construction variance are supplied as the default setting 
for the program for this reason. Users with their own 
estimates of variances can provide these to the program 
as well. PoolingPlanner is available at
http://www.kchew.ca/PoolingPlanner/.
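Equating the two RSS expressions gives N* = N × V_s / (V_s + var(epooling)), which can be sketched as below. This is not the PoolingPlanner source code, and the function names are hypothetical. One modelling choice is an assumption made here: the k replicate arrays are taken to divide only the array component, so var(epooling) = var(earray)/k + var(econstruction). To apply the 1/k factor to the whole pooling variance, as the text above states, pass the full pooling variance as `var_array` and zero as `var_construction`. The allele frequency of 0.25 in the example is arbitrary.

```python
def sampling_variance(p: float, n: int) -> float:
    """V_s = p(1 - p) / 2N for a pool of N individuals."""
    return p * (1 - p) / (2 * n)


def effective_sample_size(n: int, p: float, var_array: float,
                          var_construction: float, k: int = 1) -> float:
    """N* = N * V_s / (V_s + var(e_pooling)).

    Assumption: replicate arrays reduce only the array component,
    var(e_pooling) = var_array / k + var_construction.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    v_s = sampling_variance(p, n)
    var_pooling = var_array / k + var_construction
    rss = v_s / (v_s + var_pooling)        # relative sample size
    return n * rss


# Hypothetical pool of 300 individuals, allele frequency 0.25,
# with variance inputs of the order reported for normalized data.
for k in (1, 2, 4, 8, 12):
    print(k, round(effective_sample_size(300, 0.25, 3.3e-4, 1.9e-4, k), 1))
```

Under this assumption the gain from extra arrays flattens once var(earray)/k falls below var(econstruction), which is why the relative size of the two components matters when deciding between replicate arrays and replicate pools.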

Results 

In our analyses we encountered beads with negative 
intensity values in the red, green, or both channels. The 
number of negative beads varied by strip and typically 
affected 1-10% of beads, a pattern consistently seen across
all arrays. This can occur due to local background inten- 
sity removal at the point of image processing [23]. 
These beads were removed from our variance calcula- 
tions. Furthermore, beads with zero in both the red and 
green channels were considered failed beads and also 
dropped from our analysis. There were typically fewer 
than 100 of these per strip. Finally, SNPs having fewer 
than four bead observations were excluded. The rationale
for this was that SNPs having fewer than four
bead observations would have poorly estimated allele
frequencies.
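The three exclusion rules above can be expressed as a small filter. The sketch below is illustrative only: the function names are hypothetical, the per-bead frequency is assumed to be G/(G + R), and combining beads by a simple mean is an assumption, since the text does not state how bead-level values are summarized per SNP.

```python
def usable_beads(beads):
    """Drop beads with a negative intensity in either channel or zero in both."""
    return [(g, r) for g, r in beads
            if g >= 0 and r >= 0 and not (g == 0 and r == 0)]


def snp_frequency(beads, min_beads: int = 4):
    """Mean per-bead G / (G + R), or None when too few beads survive filtering."""
    kept = usable_beads(beads)
    if len(kept) < min_beads:
        return None                      # SNP excluded: fewer than four beads
    return sum(g / (g + r) for g, r in kept) / len(kept)


# Hypothetical SNP with six beads: one negative, one failed, four usable.
beads = [(100, 100), (-5, 50), (0, 0), (300, 100), (50, 150), (200, 200)]
freq = snp_frequency(beads)
```

Here the negative and failed beads are discarded and the remaining four beads are averaged; a SNP left with only three usable beads would return `None` and be dropped.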

Array Variance or var(earray): Type A comparisons 

We estimate array variance by comparing replicate
arrays (Type A comparisons in Figure 1) for three types
of Illumina beadarrays: the 1M-Single, the 1M-Duo, and
the 660-Quad. The results for normalized and raw data
are given in Table 1, and box plots in Figure 2 provide a
visual summary of the estimates. Clearly normalization
dramatically reduces the range of observed array variance
estimates for all array types. As well, normalization
reduced the mean array variance estimate
approximately 2.5-fold for the 1M-Duo arrays and
approximately 8-fold for the 1M-Single and 660-Quad
arrays. For normalized data most estimates of array variance,
regardless of array type, fell between 2.5 × 10⁻⁴
and 5.0 × 10⁻⁴.

For the 1M-Single arrays 12 DNA pools were allelotyped
using 24 arrays (2 arrays per pool), yielding 12
estimates of array variance, the mean of which was 3.8
× 10⁻⁴ (normalized) and 2.9 × 10⁻³ (raw data), see Table
1. For the 1M-Duo array 8 DNA pools were analyzed
Table 1 Estimates of array variance, var(earray), for three Illumina array types for normalized and raw data.

|                                        | 1M-Single                          | 1M-Duo                             | 660-Quad                           |
|----------------------------------------|------------------------------------|------------------------------------|------------------------------------|
| Normalized data: var(earray) (range)   | 3.8 × 10⁻⁴ (2.2 × 10⁻⁴ - 6.6 × 10⁻⁴) | 3.2 × 10⁻⁴ (1.6 × 10⁻⁴ - 6.3 × 10⁻⁴) | 3.3 × 10⁻⁴ (2.5 × 10⁻⁴ - 4.9 × 10⁻⁴) |
| Raw data: var(earray) (range)          | 2.9 × 10⁻³ (3.0 × 10⁻⁴ - 9.2 × 10⁻³) | 9.0 × 10⁻⁴ (1.7 × 10⁻⁴ - 4.3 × 10⁻³) | 2.7 × 10⁻³ (2.0 × 10⁻³ - 3.0 × 10⁻³) |
| Number of pools                        | 12                                 | 8                                  | 7                                  |
| Number of comparisons, var(earray) (1) | 12                                 | 45 (2)                             | 360                                |
| Number of arrays (arrays/pool)         | 24 (2/pool)                        | 32 (4/pool)                        | 72 (6 or 12/pool)                  |

(1) Each paired array comparison is treated as an independent estimate of array variance, the average of which is reported in this table.

(2) One array, in all 3 comparisons in which it was involved, produced extreme outlier var(earray) values and was removed from all analysis; hence, there are 45
instead of 48 var(earray) for the 1M-Duo arrays.



on 32 arrays (4 arrays per pool), yielding 48 estimates of
var(earray). Three of these estimates, each from pair-wise
array comparisons involving the same array, were
extreme outliers in both the normalized and raw dataset
(see Figure 3). This array was determined faulty (see Discussion)
and removed from further analysis. For the
remaining 45 estimates the mean var(earray) was 3.2 ×
10⁻⁴ (normalized) and 9.0 × 10⁻⁴ (raw data), see Table 1.
Unlike the data for the 1M-Single arrays, the 1M-Duo
array data spanned two batches of genotyping, carried
out at two different times. To look for batch effects the
1M-Duo data were also analyzed stratified by batch. The
mean array variance was significantly different between
batches for normalized data but not raw data (based on
non-overlapping confidence intervals constructed
assuming a normal distribution). Batch 1 (18 var(earray))
and batch 2 (27 var(earray)) had mean estimates of array
variance of 4.2 × 10⁻⁴ and 2.6 × 10⁻⁴, respectively. For




Figure 2 Box plots of array variance for three Illumina array
types. Box plots of var(earray(x,y)) for Illumina 1M-Duo, 1M-Single, and
660-Quad arrays for normalized and raw data. The 1M-Duo arrays
were genotyped in two batches and are plotted stratified by batch
("1M-Duo-Batch 1", "1M-Duo-Batch 2"), as well as by array type "1M-Duo".
The number of var(earray) estimates for each array type is: 1M-Duo,
n = 45; 1M-Duo-Batch 1, n = 18; 1M-Duo-Batch 2, n = 27; 1M-Single,
n = 11; 660-Quad, n = 360. Box plot whiskers are plotted at
the lowest datum within 1.5 times the interquartile range of the lower
quartile, and the highest datum within 1.5 times the interquartile range of
the upper quartile.



the 660-Quad arrays, 7 pools were assayed using 72
arrays (6 or 12 arrays per pool), and mean array variance
was 3.3 × 10⁻⁴ for normalized data, and 2.7 × 10⁻³
for raw data, see Table 1.

Pooling Variance or var(epooling): Type B and C
comparisons 

We estimate pool-construction variance for 27 DNA
pools, discussed in order by Illumina array type. Six
pools were allelotyped on the 1M-Single array, and for
each, pools were constructed in replicate and allelotyped
by two arrays. This allowed us to calculate and compare
pooling variance and pool-construction variance







Figure 3 Box plots of array variance for Illumina 1M-Duo
arrays highlighting extreme outliers. Box plots of var(earray)
estimates (n = 48) for the 1M-Duo arrays (Batch 1 and 2 combined)
highlighting the three extreme outlier estimates in both normalized
and raw data, all attributable to one array. This array was
determined faulty (see Discussion) and removed from all analyses.
Box plot whiskers are plotted at the lowest datum within 1.5 times the
interquartile range of the lower quartile, and the highest datum
within 1.5 times the interquartile range of the upper quartile.
estimates as calculated using Type B and Type C comparison
values. Figure 4 summarizes the var(epooling) and
var(econstruction) estimates for those pools on the 1M-Single
array. For normalized data var(epooling-1) ranged from
3.2 × 10⁻⁴ to 5.5 × 10⁻⁴ and averaged 4.0 × 10⁻⁴. In comparison
var(epooling-2) ranged from 3.5 × 10⁻⁴ to 7.0 × 10⁻⁴
and averaged 4.8 × 10⁻⁴. Var(econstruction-1) ranged from
0 to 6.7 × 10⁻⁵ and had a mean of 2.9 × 10⁻⁵ (where
negative values have been set to zero). Thus, for these
pools var(econstruction-1) accounts for between 0 and 20%,
or an average 7.5% of the pooling variance when using
Type B derived values (see Additional File 2, Table S2
for all values). Var(econstruction-2) ranged from 0 to 3.2 ×
10⁻⁴ and averaged 1.0 × 10⁻⁴; thus, pool-construction
variance accounted for between zero and 46%, or an
average 20% of the pooling variance using Type C
derived values (Additional File 2, Table S2). There does
not appear to be any correlation between pool size and
pool-construction variance, see Figure 4.

Using raw data, estimates of var(epooling-1) were
approximately 8-fold higher than the normalized data.
Estimates of var(econstruction-1) tended to be higher as
well, averaging ~20% of the pooling variance. Var(epooling-2)
estimates followed the same pattern, with larger estimates
of pooling variance and pool-construction
variance (data not shown).

Pools allelotyped on the 1M-Duo and 660-Quad arrays
were not constructed twice; hence, for these we estimated
pool-construction variance based on Type C
comparisons only. Seven DNA pools were allelotyped on
the 660-Quad array, two using six replicate arrays (396
estimates of var(epooling-2) each), and five using twelve
replicate arrays (720 estimates of var(epooling-2) per pool).
Figure 5 summarizes the var(epooling-2) and var(econstruction-2)
estimates for these pools (normalized data). Var(epooling-2)
estimates ranged from 4.3 × 10⁻⁴ to 5.7 × 10⁻⁴,
and averaged 5.1 × 10⁻⁴; meanwhile, the var(econstruction-2)
estimates ranged from 1.0 × 10⁻⁴ (23%) to
2.4 × 10⁻⁴ (42%) and averaged 1.9 × 10⁻⁴ (35%). These
estimates of pooling variance are very similar to those
seen for pools on the 1M-Single array; however, the estimates
of pool-construction variance are higher (see
Additional File 3, Table S3 for all values). For the raw
data var(epooling-2) estimates ranged from 2.6 × 10⁻³ to
2.9 × 10⁻³, and averaged 2.7 × 10⁻³; meanwhile, the
matched var(econstruction-2) estimates ranged from 0 to
2.6 × 10⁻⁴ (9%) and averaged 1.9 × 10⁻⁴ (2%).

1M-Duo arrays were analyzed separately by batch using
batch-specific estimates of array variance for normalized
data. The 1M-Duo batch 1 data contained three DNA
pools, each allelotyped by four replicate arrays; therefore,
each var(epooling-2) estimate is the average of 32 pair-wise
array comparisons. Figure 6 summarizes var(epooling-2)
and var(econstruction-2) estimates for these pools (normalized
data). Var(epooling-2) was estimated at 5.6 × 10⁻⁴, 6.0
× 10⁻⁴ and 6.1 × 10⁻⁴. The matched var(econstruction-2) estimates
were 1.5 × 10⁻⁴, 1.8 × 10⁻⁴, and 1.9 × 10⁻⁴, or 26%,
31%, and 32% of the pooling variance for pools sized 122,
246, and 121 (see Additional File 3, Table S3 for values).
These values reflect those seen for pools on 660-Quad
and 1M-Single arrays. In comparison, the 1M-Duo batch
2 data deviated dramatically. This batch contained 5
pools, each also allelotyped by four replicate arrays. For
these var(epooling-2) ranged from 1.8 × 10⁻³ to 3.7 × 10⁻³,
and averaged 2.6 × 10⁻³, with var(econstruction-2) estimates
ranging from 7.9 × 10⁻⁴ (43%) to 2.7 × 10⁻³ (72%) (see
Additional File 3, Table S3). For these pools the estimates
of pooling variance are nearly 2-3 fold higher than
those of batch 1 but the array variance remained low at
2.4 × 10⁻⁴, leading to high estimates of pool-construction
variance (see Discussion). For raw data batch 1 & 2 were
analyzed combined using all possible array comparisons
and var(earray) = 9.0 × 10⁻⁴. Estimates of var(epooling-2)
ranged from 2.2 × 10⁻³ to 5.4 × 10⁻³ and averaged 3.4 ×
Figure 4 Decomposition of pooling variance for Illumina 1M-Single arrays. Stacked barplots showing the normalized pooling variance
estimates, and the breakdown into array and pool-construction variance for pools allelotyped on the Illumina 1M-Single array. Estimates
derived from comparison of replicate pools are labeled "B". Estimates derived from comparison of non-identical pools are labeled "C1" and "C2"
(specifying replicate pool). The portion of pooling variance attributed to pool-construction is indicated by hatched bars, and array variance by
black or grey bars. Pool size is shown above the barplots.



Earp et al. BMC Medical Genomics 2011, 4:81 
http://www.biomedcentral.com/1755-8794/4/81 






[Figure image residue: stacked barplot of pooling variance (y-axis ticks 0.0000 to 0.0008), split into array variance and pool-construction variance. Pool sizes shown above the bars: 75, 84, 176, 303, 222, 272.] 
hatched bars, the portion of pooling variance attributed to the array is indicated by grey bars. Pool size is indicated above each stacked bar. 



10^-3. Var(econstruction-2) estimates averaged 51% of the 
calculated var(epooling-2). 

PoolingPlanner Example 

To demonstrate how to use PoolingPlanner we consider 
a hypothetical scenario. A researcher has a collection of 
samples including 300 cases and 1000 controls and 
wants to conduct a pool-based GWAS. The researcher 
needs to decide how many arrays to use, and wants to 
construct power curves that take into consideration the 
power loss concomitant with this cost-efficient strategy. 
They plan on using Illumina's 660-Quad array and nor- 
malizing their data. PoolingPlanner is used to calculate 
the effective sample size of each DNA pool using four 
input values: 1) var(earray), 2) var(econstruction), 3) pool 
size, and 4) allele frequency. Figure 7A shows the 
PoolingPlanner input panel for the case pool; Figure 7B the 



input panel for the control pool. PoolingPlanner will 
supply the var(earray) value calculated from our 
660-Quad normalized data, 3.3 x 10^-4 (see Table 2). 
Alternatively, the user may specify a custom value. In 
this example we assume var(econstruction) is 30% of the 
pooling variance, chosen to reflect values we observed. 
Var(econstruction) is entered into PoolingPlanner by 
specifying "Array:Construction Ratio = 7:3", as seen in Figure 
7A and 7B. An exact value for var(econstruction) can also 
be entered (30% of 3.3 x 10^-4 would be 9.9 x 10^-5). For 
allele frequency, by default PoolingPlanner uses HapMap 
CEU data (release 27) to set p to the average minor 
allele frequency (MAF) on the 1M-Single, 1M-Duo, or 
660-Quad Illumina array. For the 1M-Single and 1M-Duo 
arrays p = 0.21 (>95% of SNPs had available HapMap 
data), and for the 660-Quad array p = 0.29 (87% of 
SNPs had available HapMap data). Estimates of p based 
Figure 6 Decomposition of pooling variance for Illumina 1M-Duo arrays. Stacked barplots showing the normalized pooling variance 
estimates, and the breakdown into array and pool-construction variance, for pools allelotyped on the Illumina 1M-Duo array. All estimates are 
derived from comparison of non-identical pools (Type C). The portion of pooling variance attributed to pool-construction is indicated by hatched 
[Figure 7 image residue. Output table recovered from one PoolingPlanner panel:] 

Arrays   RSS     N* 
1        0.179   179.246 
2        0.304   304.001 
3        0.396   395.835 
4        0.466   466.259 
5        0.522   521.979 
6        0.567   567.166 
7        0.605   604.547 
8        0.636   635.985 
9        0.663   662.792 
10       0.686   685.922 
11       0.706   706.082 
12       0.724   723.811 
13       0.74    739.522 

[Plot: relative sample size (y-axis, 0.00 to 0.95) versus number of arrays (x-axis, 2.5 to 17.5).] 

Figure 7 PoolingPlanner. (A) Control input and output panel for the case pool. (B) Control input and output panel for the control pool. (C) 
Corresponding plot of relative sample size versus the number of replicate arrays used in allelotyping the case (blue line) and control pool (red 
line). 



on our pooled array data were similar (see Additional 
File 4, Table S4). In this example the average MAF is 
set to 0.29, but the user can enter any value between 0 
and 0.5. Once these values are entered the program cal- 
culates the relative and effective sample size of each 
DNA pool for a range of replicate array values, and pro- 
vides a corresponding table of values as seen in Figure 
7A and 7B. A plot of relative sample size versus number 
of replicate arrays is also automatically generated. For a 
DNA pool containing 300 individuals (blue line in Fig- 
ure 7C), an RSS of 80% is achieved with 6 arrays (N* is 



244) while an RSS of 90% requires 13 arrays (N* is 271). 
In contrast, for a pool of 1000 individuals (red line in 
Figure 7C), an RSS of 80% is achieved with 19 arrays 
(N* is 806). This plot makes it easy to see at what point 
additional replicate arrays begin to yield diminishing 
returns in terms of increasing the effective sample size 
of a DNA pool. 
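
The worked example above can be approximated with a short script. This is only a sketch, not PoolingPlanner's own code: it assumes that relative sample size is the binomial sampling variance p(1-p)/(2N) divided by the sum of sampling and pooling variance, and that under the "Array:Construction Ratio = 7:3" setting the array variance (divided by the number of replicate arrays) makes up 70% of the pooling variance. That functional form and the function names are assumptions, chosen because they reproduce the numbers quoted in the text.

```python
def relative_sample_size(pool_size, n_arrays, var_array=3.3e-4,
                         array_fraction=0.7, allele_freq=0.29):
    """Relative sample size (RSS) of a DNA pool under the assumed model."""
    # Binomial sampling variance of an allele frequency among 2N chromosomes.
    sampling_var = allele_freq * (1.0 - allele_freq) / (2.0 * pool_size)
    # Array variance is averaged over replicate arrays; the 7:3 ratio is
    # assumed to fix it at 70% of the total pooling variance.
    pooling_var = (var_array / n_arrays) / array_fraction
    return sampling_var / (sampling_var + pooling_var)


def effective_sample_size(pool_size, n_arrays, **kwargs):
    """Effective sample size N* = N x RSS."""
    return pool_size * relative_sample_size(pool_size, n_arrays, **kwargs)


def arrays_needed(pool_size, target_rss, max_arrays=100, **kwargs):
    """Smallest number of replicate arrays reaching the target RSS, or None."""
    for n_arrays in range(1, max_arrays + 1):
        if relative_sample_size(pool_size, n_arrays, **kwargs) >= target_rss:
            return n_arrays
    return None


if __name__ == "__main__":
    for pool_size, target in [(300, 0.80), (300, 0.90), (1000, 0.80)]:
        m = arrays_needed(pool_size, target)
        print(pool_size, target, m, round(effective_sample_size(pool_size, m)))
```

Under these assumptions the script returns 6, 13 and 19 arrays, with N* of about 244, 271 and 806, in line with the values read from Figure 7C.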

To perform pooling-adjusted power calculations, a 
pool's effective sample size, output by PoolingPlanner, is 
entered into a power calculator. We have used Quanto 
[21] for this example. Assuming an unmatched case- 



Table 2 Impact of replicate arrays on effective sample size (N*) and minimum detectable odds ratio (MDOR) in 
pooling-GWAS. 

Arrays per pool         Case pool    Control pool   MDOR at 80%   MDOR at 80% 
                        (RSS, N*)    (RSS, N*)      (p = 0.29)    (p = 0.10) 
24                      0.95, 284    0.84, 837      1.33          1.51 
12                      0.90, 259    0.72, 720      1.35          1.54 
6                       0.81, 244    0.56, 562      1.38          1.58 
3                       0.69, 205    0.39, 391      1.44          1.70 
Individual genotyping   1, 300       1, 1000        1.32          1.49 

This table compares the minimum detectable odds ratios (MDOR) at 80% power for a theoretical pooling experiment with 300 cases and 1000 controls, given a 
DNA-pooling strategy where 24, 12, 6, or 3 Illumina 660-Quad replicate arrays are used to allelotype each DNA pool (case and control). The equivalent individual 
genotyping experiment is given for reference. Relative sample size (RSS) and effective sample size (N*) are generated by PoolingPlanner assuming var(earray) = 3.3 
x 10^-4, var(econstruction) = 9.9 x 10^-5, and an average minor allele frequency of 0.29. MDOR at 80% power were calculated using Quanto [21] assuming an 
unmatched case-control design testing for gene-only effects using a log-additive model, where the incidence of the case phenotype is 0.02% and the risk allele 
frequency, p, is set to 0.29 or 0.10. 
control design testing for gene-only effects using a 
log-additive model, where the incidence of the case 
phenotype is 0.02%, and the risk allele frequency (p_risk) is 29% 
(and in complete linkage disequilibrium with a SNP on 
the array), the power curves corresponding to a pooling 
experiment where 3, 6, 12, or 24 Illumina 660-Quad 
replicate arrays are used per pool are given in Figure 8. 
The power curve for individual genotyping is also 
plotted for reference. Table 2 accompanies Figure 8 
and gives the minimum detectable odds ratio (MDOR) 
at 80% power for each curve when p_risk is 0.29, and for 
comparison, when p_risk is 0.1. Assuming individual 
genotyping, the MDOR at 80% power would be 1.32 when 
p_risk is 0.29. Using 24 arrays per pool this value rises 
incrementally to 1.33. Using 12, 6, or 3 arrays per pool, 
the MDORs further increase to 1.35, 1.38, and 1.44, 
respectively. Only when 3 arrays are used per pool does 
the MDOR differ dramatically between pooling and 
individual genotyping. Marginal improvements in 
MDOR should be considered in light of increasing 
experimental cost, and the percent cost of a pooling 
GWAS relative to a conventional GWAS is given in 
Table 2 to highlight this difference. If arrays cost $250, 
the ability to detect an odds ratio of 1.38 with 80% 
[Figure 8 image residue: power plotted against odds ratio (x-axis "Odds Ratio"); one curve labelled "3 arrays / pool".] 

Figure 8 Example use of PoolingPlanner. Power curves for a 
theoretical pooling experiment with 300 cases and 1000 controls 
where 24, 12, 6, or 3 Illumina 660-Quad replicate arrays are used to 
allelotype the DNA pools. The equivalent individual genotyping 
experiment is given for reference. Effective sample size assuming 24, 
12, 6, or 3 arrays was calculated using PoolingPlanner (see Table 2) 
and these values entered into Quanto [21] to obtain pool-adjusted 
estimates of power over a range of odds ratios. Calculations are 
based on an unmatched case-control design testing for gene-only 
effects using a log-additive model, where the incidence of the case 
phenotype is 0.02%, and the risk allele frequency (p_risk) is 29% (and 
in complete linkage disequilibrium with a SNP on the array). A 
dashed line is drawn to indicate the 80% power threshold. 
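
For readers without access to Quanto, the shape of these power curves can be roughed out as follows. This is a sketch under stated assumptions and not the calculation Quanto performs: it uses a two-sided normal-approximation test of the case-control allele frequency difference in place of the log-additive likelihood model, assumes a significance threshold of 0.05 (z = 1.96), and takes the control allele frequency to equal p_risk because the phenotype is rare. The function names are hypothetical; the effective sample sizes are taken from Table 2.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def allelic_power(odds_ratio, n_cases, n_controls, p_risk=0.29, z_crit=1.959964):
    """Approximate power of a two-sided allelic z-test (assumed alpha = 0.05)."""
    # Allele frequency in cases implied by a per-allele odds ratio,
    # treating the control frequency as the population frequency.
    p_case = odds_ratio * p_risk / (1.0 - p_risk + odds_ratio * p_risk)
    variance = (p_case * (1.0 - p_case) / (2.0 * n_cases)
                + p_risk * (1.0 - p_risk) / (2.0 * n_controls))
    z = (p_case - p_risk) / math.sqrt(variance)
    return normal_cdf(z - z_crit) + normal_cdf(-z - z_crit)


def min_detectable_or(n_cases, n_controls, target_power=0.80, **kwargs):
    """Bisection search for the smallest odds ratio (> 1) reaching target power."""
    low, high = 1.0, 5.0
    for _ in range(60):
        mid = 0.5 * (low + high)
        if allelic_power(mid, n_cases, n_controls, **kwargs) < target_power:
            low = mid
        else:
            high = mid
    return high


if __name__ == "__main__":
    # Individual genotyping versus 6 arrays per pool (N* values from Table 2).
    print("individual:", round(min_detectable_or(300, 1000), 2))
    print("6 arrays:  ", round(min_detectable_or(244, 562), 2))
```

The approximation should land close to, though not exactly on, the Quanto values in Table 2; the point is that the effective sample sizes N* simply replace the true sample sizes in an otherwise standard power calculation.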



power would cost $3,000 (6 arrays per pool), while the 
ability to detect an odds ratio of 1.32 would cost $325,000 
(individual genotyping). In many cases, particularly for 
phenotypes suggestive of moderate to large odds ratios, 
this difference in detectable odds ratios will not change 
the overall outcome of the association study. In a 
pooling GWAS, as in conventional GWAS, for rarer risk 
alleles we have less power to detect associations (see the 
MDOR in Table 2 when p_risk is 0.1). We note that as 
p_risk gets smaller, the difference in the MDOR for a 
pooling versus individual genotyping experiment 
becomes more noticeable. For example, when 6 replicate 
arrays are used per pool and p_risk is 0.29, the MDOR 
differs by 0.06 from individual genotyping, but this 
difference becomes 0.09 when p_risk is 0.1. It is also worth 
noting in Table 2 that using the same number of 
replicate arrays on different sized DNA pools gives very 
different RSS values. Contrary to what might be expected, the 
maximally powered pool-based experiment occurs when 
arrays are equally distributed amongst pools, regardless 
of differences in pool size and RSS, assuming the 
pool-construction variance is constant (see Additional File 5, 
Table S5 & Additional File 6, Figure S1). By conducting 
an analysis such as this a user can decide what power is 
forfeited by conducting a pool-based GWAS, and decide 
whether the approach makes practical sense in their 
situation. 
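
The claim about equal allocation can be checked numerically. The sketch below assumes that the variance of the case-control allele frequency difference is the sum, over the two pools, of sampling variance, a constant pool-construction variance, and array variance divided by the number of replicate arrays. That additive form and the function names are assumptions made for illustration; the variance values are those quoted in the Table 2 note.

```python
def difference_variance(case_arrays, control_arrays, n_cases=300, n_controls=1000,
                        var_array=3.3e-4, var_construction=9.9e-5, allele_freq=0.29):
    """Variance of the case-control allele frequency difference (assumed form)."""
    total = 0.0
    for pool_size, n_arrays in ((n_cases, case_arrays), (n_controls, control_arrays)):
        sampling_var = allele_freq * (1.0 - allele_freq) / (2.0 * pool_size)
        # Only the array term depends on how the arrays are split.
        total += sampling_var + var_construction + var_array / n_arrays
    return total


def best_split(total_arrays, **kwargs):
    """Split of a fixed array budget between pools that minimizes the variance."""
    splits = [(m, total_arrays - m) for m in range(1, total_arrays)]
    return min(splits, key=lambda s: difference_variance(s[0], s[1], **kwargs))


if __name__ == "__main__":
    print(best_split(12))                              # 300 cases, 1000 controls
    print(best_split(12, n_cases=100, n_controls=5000))
```

Because only the term var(earray)/m depends on the allocation, the minimum falls at an even split whatever the pool sizes, which is why the smaller case pool does not warrant fewer arrays under this assumption.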

Discussion 

In the first part of this study we set out to establish a 
range of experimentally observed values for array var- 
iance on Illumina's SNP-genotyping beadarrays. At the 
same time, we wanted to establish a range of values for 
pool construction variance. In the second part, we used 
these estimates to calculate the effective sample size of a 
DNA pool given a range of replicate array values, and 
provide an online tool to allow readers to do the same. 

At the time of our analysis we were aware of only one 
report that estimated array variance (var(earray) = 1.1 x 
10^-4) for an Illumina HumanHap300 beadarray [18]. 
Illumina has since released higher density arrays (>1 
million SNPs per array), and we wanted to determine if 
increased SNP density negatively impacted array 
variance. Overall, we found this was not the case. All of the 
Illumina array types examined here (660-Quad, 1M-Single, 
1M-Duo) had very similar var(earray) estimates, 
centering around 3 x 10^-4 for our normalized data, which is 
largely in keeping with the HumanHap300 result [18]. 
We expect this result would extend to the 
HumanOmni1-Quad array, although it was not analyzed 
here. We found that the normalization procedure we 
used reduced the array variance 2- to 8-fold, and a 
newly reported normalization algorithm suggests that 
array variance can be reduced even further [24]. 
Reduced array variance should mean more precise 
estimates of allele frequency, which should further 
minimize the loss of power associated with using the DNA 
pooling strategy. 

The Illumina arrays analyzed here yielded var(earray) 
estimates ~10-fold smaller than those of the Affymetrix 
HindIII 50K arrays (var(earray) = 1.26 x 10^-3) analyzed by 
MacGregor [17]. A similar result was noted when 
Affymetrix arrays were compared to Illumina 
HumanHap300 arrays [18]. In part, this may be explained by 
differences in the manufacturing of the arrays. 
MacGregor et al. [18] report that pooling errors appear to be 
highly related to the number of probes used to estimate 
SNP allele frequency. While 10 probe pairs are assigned 
to each SNP on the Affymetrix HindIII 50K arrays [18], 
on average 16-18 beads are used on the Illumina arrays. 
Further, on Illumina arrays beads are randomly 
dispersed on a slide [22], while on Affymetrix arrays probes 
are fixed in a given location, making the latter more 
susceptible to location-specific technical errors. As the 
array variance gets smaller (i.e. when using Illumina 
arrays), we expect the pool-construction variance to 
account for a greater proportion of the pooling variance. 

Our estimates of var(econstruction) spanned 27 DNA 
pools, ranging in size from 74 to 446 individual samples, 
allowing us to sample a range of possible 
pool-construction variances. First, in contrast to a previous report 
[25], we did not observe a relationship between pool 
size and pool-construction variance. We did, however, 
observe batch effects. For the 1M-Duo arrays, which 
were processed in two batches on different dates, we 
observed very different estimates of pooling variance 
and pool-construction variance (see Figure 6). Most of 
our estimates of pool-construction variance were based 
on values from Type C comparisons, and for these 
var(econstruction) usually fell between 20 and 40% of the 
pooling variance. When calculations were based on the 
comparison of replicate DNA pools (Type B 
comparisons, 1M-Single arrays only) our estimates were smaller, 
on average 7.5% of the pooling variance. There are 
several possible reasons for this. The adjustment for 
binomial sampling variance may not fully account for the 
variance arising from sampling, leaving variance that is 
then attributed to pool-construction in the Type C 
comparisons. As well, some estimates of pool-construction 
variance were negative, and these were set to zero, 
which would lead to overestimation of pool-construction 
variance. We conclude that relative to var(earray), 
var(econstruction) is of less importance; however, our results 
suggest pool construction may account for more of the 
pooling variance than previously estimated [17]. 
MacGregor [17] attributed 12.5% of the pooling variance to 
pool-construction when using Affymetrix HindIII 50K 
arrays. On average we attribute 30% of pooling variance 
to pool construction when using Illumina arrays. This 
difference is what might be expected given the smaller 
var(earray) for Illumina arrays. Further reductions in 
array variance, for example, through improved 
normalization of array data, have the potential to further shift 
the proportion of an experiment's pooling variance that 
is attributed to pool-construction errors. 
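
The upward bias introduced by setting negative variance estimates to zero can be illustrated with a small simulation. The numbers below are hypothetical and chosen only for illustration (a true construction variance of 2 x 10^-5 estimated with normally distributed error of standard deviation 1 x 10^-4); they are not taken from our data, and the function name is an assumption.

```python
import random


def truncation_bias(true_var=2e-5, noise_sd=1e-4, n_estimates=10000, seed=1):
    """Mean of noisy variance estimates before and after clipping at zero."""
    rng = random.Random(seed)
    # Unbiased but noisy estimates; some fall below zero by chance.
    raw = [rng.gauss(true_var, noise_sd) for _ in range(n_estimates)]
    # Setting negative estimates to zero removes only the low tail.
    clipped = [max(0.0, value) for value in raw]
    return sum(raw) / n_estimates, sum(clipped) / n_estimates


if __name__ == "__main__":
    raw_mean, clipped_mean = truncation_bias()
    print("mean of raw estimates:    ", raw_mean)
    print("mean of clipped estimates:", clipped_mean)
```

The raw estimates average close to the true value, while the clipped estimates average noticeably higher, so an average taken over clipped estimates will tend to overstate pool-construction variance when the true value is small relative to the estimation error.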

With respect to the design of pool-based experiments 
when using Illumina arrays, our partitioning of the 
pooling variance still suggests, in agreement with [17], that 
constructing fewer (larger) pools while using more replicate 
arrays (i.e. targeting array variance) is the most effective way 
to reduce pooling variance and conduct the most efficient 
pool-based GWAS. Further, for an equivalent pool-based 
experiment using Affymetrix arrays in place of Illumina arrays, 
more array replicates will be needed (~10-fold more). As 
the proportion of array variance to pool construction 
variance approaches 50:50, strategies to reduce pool 
construction variance become more important. 

For one of our experiments, 1M-Duo Batch 2, we 
observed unusually high estimates of pool-construction 
variance and low estimates of array variance (see Figure 
6). In this experiment, pool replicates were allelotyped 
on the same physical array (which holds two samples). 
Subsequently, we noticed that the array variance for 
replicates on the same chip was much smaller than the 
variance for replicates on different chips. Overall, this 
led to the array variance being underestimated relative 
to the pooling variance, leaving more variance to be 
accounted for by pool construction. In addition, the 
between-chip variance for these arrays was much higher 
than observed in the 1M-Duo Batch 1 dataset, which 
led to large estimates of pooling and pool-construction 
variance overall. Ultimately, this was traced back to 
unusually high red channel intensity on some arrays, despite 
normalization, which biased allele frequency estimates 
array-wide. Clearly this will influence any downstream 
association analysis, so in this case, our analysis of 
variance served to flag a serious problem in the array data. 
It also highlighted the need to randomize DNA pool 
replicates among arrays that carry more than one 
sample, and to randomize by location on the array, 
particularly in the case of the 660-Quad and HumanOmni1-Quad 
arrays, which carry four samples. 
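
One simple way to randomize pool replicates across chips and positions is sketched below. The function name, pool labels, and layout representation are hypothetical and for illustration only; this is not a procedure we used. Note that plain shuffling does not guarantee that replicates of a pool end up on different chips, so a layout can be redrawn, or a blocked design used instead, if that is required.

```python
import random


def randomize_layout(pools, replicates_per_pool, samples_per_chip, seed=None):
    """Randomly assign every pool replicate to a (chip, position) slot."""
    rng = random.Random(seed)
    samples = [(pool, replicate)
               for pool in pools
               for replicate in range(1, replicates_per_pool + 1)]
    rng.shuffle(samples)
    layout = {}
    for index, sample in enumerate(samples):
        # Fill chips in order; the shuffle randomizes both chip and position.
        chip, position = divmod(index, samples_per_chip)
        layout[(chip + 1, position + 1)] = sample
    return layout


if __name__ == "__main__":
    # Two pools, four replicates each, on arrays that carry four samples.
    layout = randomize_layout(["case", "control"], 4, 4, seed=42)
    for slot in sorted(layout):
        print(slot, layout[slot])
```

Recording the seed alongside the layout lets the assignment be reproduced later when interpreting chip or position effects.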

The differences between 1M-Duo Batch 1 and 2 data 
were significant for normalized data, but not raw data. 
On one hand, it may be that greater noise associated 
with the raw data prevented differences in array variance 
and pool construction variance from being significant. 
On the other, it is possible that the normalization 
procedure itself exacerbated technical artifacts only present 
on some arrays, leading to the observed differences in 
normalized data. This can occur if technical artifacts 
violate the assumptions of the normalization [26]. 

Conclusions 

We have provided empirical estimates of var(earray) and 
var(econstruction) for a range of DNA pool sizes. We have 
also presented PoolingPlanner, a simple program to help 
translate these variances into their effect on sample size, 
information that can then be used in a power calculator 
to conduct pool-adjusted calculations. PoolingPlanner may 
be helpful in quickly assessing theoretical best and 
worst-case scenarios for a DNA pooling GWAS. With 
this information the user can then make a more 
informed decision about how to carry out their pooling 
experiment to optimally balance cost with loss of power. 